Using SOM and LVQ for HMM Training
1.1.1 The Segmental SOM Training
Abstract
1.1 New training methods for the HMMs

Training context-independent phoneme models for a minimal recognition error rate is difficult, because the variability of the phonemes across conditions and contexts is substantial, and the output densities of different phonemes also overlap. A structure that can automatically adapt to all the complicated density functions has a vast number of parameters, so the quality and quantity of the available training data are crucial for proper estimation. Given the size of the models and the training database, robustness to the initial parameter values is required to avoid an excessively large number of training epochs and long training times. The practical problem with widely used training algorithms such as segmental K-means (SKM) [8] and segmental Generalized Probabilistic Descent (SGPD) [1] is that they sometimes converge slowly to low error rates unless good initial models are available. Several common initialization methods have been compared for mixture density hidden Markov models (MDHMMs). The best results, in terms of quickly reaching low final error rates in automatic speech recognition (ASR) tests, were obtained by using Self-Organizing Maps (SOMs) [2] to first train phoneme-dependent codebooks and then using the codebook vectors as kernel centroids for the mixture densities. If Learning Vector Quantization (LVQ) [2] is applied after the SOMs, small further improvements in the initialization can be achieved, but the SOM training alone can be performed much faster, because each phoneme codebook can be trained individually as a small SOM. The developed segmental SOM training for HMMs [5] resembles the conventional SKM-type Viterbi training, but the main difference is that the parameters of the mixtures belonging to the neighborhood of the best-matching component are also adapted.
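The SOM-based codebook initialization described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: all function names, grid sizes, and training schedules are assumptions. One small SOM is trained per phoneme, and its codebook vectors then serve as the initial Gaussian mean vectors of that phoneme's mixture density.

```python
import numpy as np

def train_phoneme_som(frames, grid=(4, 4), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on the feature frames of one phoneme.

    The returned codebook rows can seed the Gaussian means of that
    phoneme's mixture density (illustrative sketch; parameters assumed).
    """
    rng = np.random.default_rng(seed)
    n_units = grid[0] * grid[1]
    # Unit coordinates on the 2-D map grid, used for the neighborhood kernel.
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])],
                      dtype=float)
    # Initialize the codebook with randomly chosen training frames.
    codebook = frames[rng.choice(len(frames), n_units)].astype(float)
    for e in range(epochs):
        lr = lr0 * (1.0 - e / epochs)                   # shrinking learning rate
        sigma = max(sigma0 * (1.0 - e / epochs), 0.5)   # shrinking neighborhood
        for x in frames[rng.permutation(len(frames))]:
            # Best-matching unit: nearest codebook vector in Euclidean distance.
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
            # Neighborhood weights decay with grid distance from the BMU.
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2 * sigma ** 2))
            # Pull the BMU *and its map neighbors* toward the input frame.
            codebook += lr * h[:, None] * (x - codebook)
    return codebook
```

In use, one such SOM would be trained per phoneme on that phoneme's segmented frames; training many small maps is fast, which matches the speed advantage noted in the text.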
The motivation for the neighborhood adaptation is parameter smoothing, where the neighborhood size controls the trade-off between the level of smoothing and the fitting accuracy to the training data. A wide neighborhood at the beginning also ensures that all the available codebook units are drawn into useful regions of the input space. Compared to codebooks trained without smoothing (e.g. by SKM), the accuracy provided by the best-matching Gaussian is usually worse, but that of the next (K − 1) best matches is better, which provides generalization for slightly discrepant characteristics of the test data. The motivation for having ordered density codebooks is to enable accelerated state pdf …
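The speed-up that an ordered codebook makes possible can be illustrated roughly as follows: because neighboring kernels on the map model similar regions of the input space, a state's mixture likelihood can be approximated by evaluating only the best-matching component and its precomputed map neighbors instead of all K Gaussians. This sketch assumes diagonal Gaussians with a shared inverse variance and drops the normalization constants; all names are hypothetical, not the authors' implementation.

```python
import numpy as np

def approx_mixture_loglik(x, means, log_w, inv_var, neighbor_lists):
    """Approximate a diagonal-Gaussian mixture log-likelihood by evaluating
    only the best-matching kernel and its map neighbors.

    neighbor_lists[i] holds unit i plus its topological neighbors on the map.
    Normalization constants are dropped for clarity (sketch only).
    """
    # Cheap search for the best-matching unit (variance-weighted distance).
    d2 = ((means - x) ** 2 * inv_var).sum(axis=1)
    bmu = int(np.argmin(d2))
    idx = neighbor_lists[bmu]            # BMU plus its map neighbors
    # Evaluate only those components; log-density is -0.5 * d2 up to a constant.
    logs = log_w[idx] - 0.5 * d2[idx]
    # Numerically stable log-sum-exp over the evaluated components.
    m = logs.max()
    return float(m + np.log(np.exp(logs - m).sum()))
```

Since the omitted components contribute only positive terms to the sum, the approximation is a lower bound on the full mixture likelihood, and it tightens as the neighborhood grows.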
Similar resources
Self-organization in mixture densities of HMM based speech recognition
In this paper experiments are presented to apply Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ) for training mixture density hidden Markov models (HMMs) in automatic speech recognition. The decoding of spoken words into text is made using speaker dependent, but vocabulary and context independent phoneme HMMs. Each HMM has a set of states and the output density of each state is...
Comparison results for segmental training algorithms for mixture density HMMs
This work presents experiments on four segmental training algorithms for mixture density HMMs. The segmental versions of SOM and LVQ3 suggested by the author are compared against the conventional segmental K-means and the segmental GPD. The recognition task used as a test bench is the speaker dependent, but vocabulary independent automatic speech recognition. The output density function of each...
Using the self-organizing map to speed up the probability density estimation for speech recognition with mixture density HMMs
This paper presents methods to improve the probability density estimation in hidden Markov models for phoneme recognition by exploiting the Self-Organizing Map (SOM) algorithm. The advantage of using the SOM is based on the created approximative topology between the mixture densities by training the Gaussian mean vectors used as the kernel centers by the SOM algorithm. The topology makes the ne...
Using PCA with LVQ, RBF, MLP, SOM and Continuous Wavelet Transform for Fault Diagnosis of Gearboxes
A new method based on principal component analysis (PCA) and artificial neural networks (ANN) is proposed for fault diagnosis of gearboxes. Firstly the six different base wavelets are considered, in which three are from real valued and other three from complex valued. Two wavelet selection criteria Maximum Energy to Shannon Entropy ratio and Maximum Relative Wavelet Energy are used and compared...
Using Self-Organizing Maps and Learning Vector Quantization for Mixture Density Hidden Markov Models. Acta Polytechnica
Thesis for the degree of Doctor of Technology to be presented with due permission for public examination and criticism in Auditorium F1 of the Helsinki University of Technology on the 3rd of October, at 12 o'clock noon. ABSTRACT This work presents experiments to recognize pattern sequences using hidden Markov models (HMMs). The pattern sequences in the experiments are computed from speech signa...
Publication date: 1997